{
 "cells": [
  {
   "cell_type": "raw",
   "id": "fd3395f2",
   "metadata": {},
   "source": [
    "Copyright 2023 Google LLC\n",
    "\n",
    "Licensed under the Apache License, Version 2.0 (the \"License\");\n",
    "you may not use this file except in compliance with the License.\n",
    "You may obtain a copy of the License at\n",
    "\n",
    "    http://www.apache.org/licenses/LICENSE-2.0\n",
    "\n",
    "Unless required by applicable law or agreed to in writing, software\n",
    "distributed under the License is distributed on an \"AS IS\" BASIS,\n",
    "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
    "See the License for the specific language governing permissions and\n",
    "limitations under the License."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9d0817cd-9e64-4d5e-9c66-4e0961aa1085",
   "metadata": {
    "tags": []
   },
   "source": [
    "# MLOps End to End Workflow (III)\n",
    "\n",
    "Implementation of an end-to-end ML Ops workflow for the use case to detect fraudulent credit card transactions, see [Kaggle dataset](https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud).\n",
    "\n",
    "This set of notebooks cover:\n",
    "\n",
    "[Experimentation](01-experimentation.ipynb):\n",
    "1. Set up: Creation of the Vertex Dataset, extraction of the schema\n",
    "2. Implementation of a TFX pipeline\n",
    "\n",
    "[CICD](02-cicd.ipynb):\n",
    "\n",
    "3. Deployment of the Vertex AI Pipeline through a CI/CD process\n",
    "4. Deployment of a Continuous Training pipeline that can be triggered via Pub/Sub and produces a model in the Model Registry\n",
    "5. Deployment of the Inference Pipeline consisting of a Cloud Function that retrieves features from Feature Store and calls the model on a Vertex AI Endpoint\n",
    "6. Deployment of the model to a Vertex AI Endpoint through a CI/CD process.\n",
    "\n",
    "[Prediction](03-prediction.ipynb):\n",
    "\n",
    "7. Deploy the model to an endpoint\n",
    "8. Create a test prediction\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f5c0f8ad-ca30-4f5c-a135-809409f58abd",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Configuration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2b997d6a-f8fd-43f5-bafd-81b169965160",
   "metadata": {},
   "outputs": [],
   "source": [
    "#%load_ext autoreload\n",
    "#%autoreload 2\n",
    "\n",
    "import os\n",
    "import pandas as pd\n",
    "import tensorflow as tf\n",
    "import tensorflow_data_validation as tfdv\n",
    "from google.cloud import bigquery\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "from google.cloud import aiplatform as vertex_ai\n",
    "\n",
    "import yaml\n",
    "import os\n",
    "\n",
    "with open('mainconfig.yaml') as f:\n",
    "    main_config = yaml.safe_load(f)\n",
    "\n",
    "# select your config    \n",
    "main_config = main_config['creditcards']\n",
    "\n",
    "PROJECT = main_config['project'] \n",
    "REGION = main_config['region'] \n",
    "DOCKER_REPO = main_config['docker_repo']\n",
    "\n",
    "SERVICE_ACCOUNT = main_config['service_account']\n",
    "\n",
    "# BigQuery and data locations\n",
    "\n",
    "BQ_SOURCE_TABLE= main_config['bq']['source_table'] # raw input\n",
    "ML_TABLE = main_config['bq']['ml_table'] # the one we will use for the training\n",
    "\n",
    "BQ_DATASET_NAME = main_config['bq']['dataset']\n",
    "BQ_LOCATION = main_config['bq']['location'] # multiregion provides more resilience\n",
    "\n",
    "VERTEX_DATASET_NAME = main_config['vertex_dataset_name']\n",
    "\n",
    "RAW_SCHEMA_DIR = main_config['raw_schema_dir']\n",
    "\n",
    "BUCKET =  main_config['bucket']\n",
    "\n",
    "# TFX and model config\n",
    "\n",
    "# model version\n",
    "VERSION = main_config['version']\n",
    "\n",
    "\n",
    "MODEL_DISPLAY_NAME = f'{VERTEX_DATASET_NAME}-classifier-{VERSION}'\n",
    "WORKSPACE = f'gs://{BUCKET}/{VERTEX_DATASET_NAME}'\n",
    "\n",
    "MLMD_SQLLITE = 'mlmd.sqllite'\n",
    "ARTIFACT_STORE = os.path.join(WORKSPACE, 'tfx_artifacts_interactive')\n",
    "MODEL_REGISTRY = os.path.join(WORKSPACE, 'model_registry')\n",
    "PIPELINE_NAME = f'{MODEL_DISPLAY_NAME}-train-pipeline'\n",
    "PIPELINE_ROOT = os.path.join(ARTIFACT_STORE, PIPELINE_NAME)\n",
    "\n",
    "ENDPOINT_DISPLAY_NAME = f'{VERTEX_DATASET_NAME}-classifier'\n",
    "\n",
    "FEATURESTORE_ID = main_config['featurestore_id']\n",
    "\n",
    "CF_REGION = main_config['cloudfunction_region']\n",
    "\n",
    "DATAFLOW_SUBNETWORK = f\"https://www.googleapis.com/compute/v1/projects/{PROJECT}/regions/{REGION}/subnetworks/{main_config['dataflow']['subnet']}\"\n",
    "DATAFLOW_SERVICE_ACCOUNT = main_config['dataflow']['service_account']\n",
    "\n",
    "CLOUDBUILD_SA = f'projects/{PROJECT}/serviceAccounts/{SERVICE_ACCOUNT}'\n",
    "\n",
    "LIMIT=main_config['limit']\n",
    "\n",
    "print(\"Project ID:\", PROJECT)\n",
    "print(\"Region:\", REGION)\n",
    "print(\"Service Account:\", SERVICE_ACCOUNT)\n",
    "\n",
    "vertex_ai.init(\n",
    "    project=PROJECT,\n",
    "    location=REGION\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bb98a921-cba7-46ee-baa7-b0a4461c3b2c",
   "metadata": {},
   "outputs": [],
   "source": [
    "os.environ[\"VERTEX_DATASET_NAME\"] = VERTEX_DATASET_NAME\n",
    "os.environ[\"MODEL_DISPLAY_NAME\"] = MODEL_DISPLAY_NAME\n",
    "os.environ[\"PIPELINE_NAME\"] = PIPELINE_NAME\n",
    "os.environ[\"PROJECT\"] = PROJECT\n",
    "os.environ['GOOGLE_CLOUD_PROJECT'] = PROJECT\n",
    "os.environ[\"REGION\"] = REGION\n",
    "os.environ[\"GCS_LOCATION\"] = f\"gs://{BUCKET}/{VERTEX_DATASET_NAME}\"\n",
    "os.environ[\"SERVICE_ACCOUNT\"] = DATAFLOW_SERVICE_ACCOUNT\n",
    "os.environ[\"GCS_LOCATION\"] = f\"gs://{BUCKET}/{VERTEX_DATASET_NAME}/e2e_tests\"\n",
    "os.environ[\"SUBNETWORK\"] = DATAFLOW_SUBNETWORK"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9ad9e768-4c25-4765-aeac-9e47f6c6cc70",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "!python -m pytest src/tests/pipeline_deployment_tests.py::test_e2e_pipeline -s"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "90774f24-3a81-4b6b-89ef-8dcd08387fdb",
   "metadata": {},
   "outputs": [],
   "source": [
    "config.BEAM_RUNNER"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a271d4e3-d45d-487f-972d-0b4e06db7758",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from src.tfx_pipelines import config\n",
    "import importlib\n",
    "\n",
    "importlib.reload(config)\n",
    "\n",
    "for key, value in config.__dict__.items():\n",
    "    if key.isupper(): print(f'{key}: {value}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c4d086a8-cd73-403f-92ed-6cd8d68321d8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Repo should has been created in the Terraform automation stage\n",
    "#! gcloud artifacts repositories create {VERTEX_DATASET_NAME} --location={REGION} --repository-format=docker"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "05123e22-d8f8-4cfd-a743-fe8c6e4925d9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# You can also use build/Dockerfile.dataflow in case Internet access is not allowed\n",
    "!cp build/Dockerfile.dataflow_internet Dockerfile"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bc17291e-d63f-4e13-bbd7-5de2510c87f1",
   "metadata": {},
   "outputs": [],
   "source": [
    "os.environ[\"DOCKER_REPO\"] = f\"{DOCKER_REPO}/vertex:latest\"\n",
    "!gcloud builds submit --project=$PROJECT --billing-project=$PROJECT --region $REGION --tag $DOCKER_REPO/dataflow:latest . --timeout=15m --machine-type=e2-highcpu-8 --suppress-logs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2595e5e8-47bc-4f6d-8a39-b09d92588be2",
   "metadata": {},
   "outputs": [],
   "source": [
    "!echo $TFX_IMAGE_URI"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "12aff000-b7b8-4f1a-a2b7-2d4f56343a9e",
   "metadata": {},
   "outputs": [],
   "source": [
    "!echo $PYTHONPATH"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a12ebc0c-03ed-46eb-9610-9f4c24f60cce",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "!cp build/Dockerfile.vertex_internet Dockerfile\n",
    "!gcloud builds submit --project=$PROJECT --billing-project=$PROJECT --region $REGION --tag $TFX_IMAGE_URI . --timeout=15m --machine-type=e2-highcpu-8 --suppress-logs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6e73afd4-c486-48b2-ba98-fe5d5bd56ceb",
   "metadata": {},
   "outputs": [],
   "source": [
    "PIPELINES_STORE = f\"gs://{BUCKET}/{VERTEX_DATASET_NAME}/compiled_pipelines/\"\n",
    "!gsutil cp {pipeline_definition_file} {PIPELINES_STORE}\n",
    "PIPELINES_STORE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e876c9eb-6f73-4ea9-a85c-0172c728c1d3",
   "metadata": {},
   "outputs": [],
   "source": [
    "from src.tfx_pipelines import config, runner\n",
    "\n",
    "pipeline_definition_file = f'{config.PIPELINE_NAME}.json'\n",
    "pipeline_definition = runner.compile_training_pipeline(pipeline_definition_file)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8516b764-1143-4b9e-9ae7-af9340146ec3",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from google.cloud.aiplatform import pipeline_jobs\n",
    "    \n",
    "job = pipeline_jobs.PipelineJob(template_path = pipeline_definition_file,\n",
    "                                display_name=VERTEX_DATASET_NAME,\n",
    "                                #enable_caching=False,\n",
    "                                parameter_values={\n",
    "                                    'learning_rate': 0.003,\n",
    "                                    'batch_size': 512,\n",
    "                                    'steps_per_epoch': int(config.TRAIN_LIMIT) // 512,\n",
    "                                    'hidden_units': '128,128',\n",
    "                                    'num_epochs': 30,\n",
    "                                })\n",
    "\n",
    "job.run(sync=False, service_account=DATAFLOW_SERVICE_ACCOUNT)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e504cc07-22ee-45ba-9416-73cd22251c01",
   "metadata": {},
   "outputs": [],
   "source": [
    "CICD_IMAGE_URI = f\"{DOCKER_REPO}/cicd:latest\"\n",
    "CICD_IMAGE_URI"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a38e3ffc-fd4e-4934-b2b2-995070ae2076",
   "metadata": {},
   "outputs": [],
   "source": [
    "# For the CICD container, we are just adding the build/* dir\n",
    "!cp build/Dockerfile.cicd_internet build/Dockerfile\n",
    "!gcloud builds submit --project=$PROJECT --billing-project=$PROJECT --region $REGION --tag $CICD_IMAGE_URI build/. --timeout=15m --machine-type=e2-highcpu-8 --suppress-logs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "370f9daf-7602-48d7-92b4-0f265ba9bee9",
   "metadata": {},
   "outputs": [],
   "source": [
    "REPO_URL = main_config['git']['repo_url']\n",
    "BRANCH = main_config['git']['branch']\n",
    "\n",
    "\n",
    "GCS_LOCATION = f\"gs://{BUCKET}/{VERTEX_DATASET_NAME}/\"\n",
    "TEST_GCS_LOCATION = f\"gs://{BUCKET}/{VERTEX_DATASET_NAME}/e2e_tests\"\n",
    "CI_TRAIN_LIMIT = 1000\n",
    "CI_TEST_LIMIT = 100\n",
    "CI_UPLOAD_MODEL = 0\n",
    "CI_ACCURACY_THRESHOLD = -0.1 # again setting accuracy threshold to negative\n",
    "BEAM_RUNNER = \"DataflowRunner\"\n",
    "TRAINING_RUNNER = \"vertex\"\n",
    "VERSION = 'latest'\n",
    "PIPELINE_NAME = f'{MODEL_DISPLAY_NAME}-train-pipeline'\n",
    "PIPELINES_STORE = f\"{GCS_LOCATION}compiled_pipelines/\"\n",
    "\n",
    "TFX_IMAGE_URI = f\"{DOCKER_REPO}/vertex:{VERSION}\"\n",
    "DATAFLOW_IMAGE_URI = f\"{DOCKER_REPO}/dataflow:latest\"\n",
    "\n",
    "REPO_NAME = REPO_URL.split('/')[-1]\n",
    "DESCR=f'\"Deploy train pipeline to GCS from {BRANCH}\"'\n",
    "\n",
    "\n",
    "SUBSTITUTIONS=f\"\"\"\\\n",
    "_REPO_URL='{REPO_URL}',\\\n",
    "_BRANCH={BRANCH},\\\n",
    "_CICD_IMAGE_URI={CICD_IMAGE_URI},\\\n",
    "_PROJECT={PROJECT},\\\n",
    "_REGION={REGION},\\\n",
    "_GCS_LOCATION={GCS_LOCATION},\\\n",
    "_TEST_GCS_LOCATION={TEST_GCS_LOCATION},\\\n",
    "_BQ_LOCATION={BQ_LOCATION},\\\n",
    "_BQ_DATASET_NAME={BQ_DATASET_NAME},\\\n",
    "_ML_TABLE={ML_TABLE},\\\n",
    "_VERTEX_DATASET_NAME={VERTEX_DATASET_NAME},\\\n",
    "_MODEL_DISPLAY_NAME={MODEL_DISPLAY_NAME},\\\n",
    "_CI_TRAIN_LIMIT={CI_TRAIN_LIMIT},\\\n",
    "_CI_TEST_LIMIT={CI_TEST_LIMIT},\\\n",
    "_CI_UPLOAD_MODEL={CI_UPLOAD_MODEL},\\\n",
    "_CI_ACCURACY_THRESHOLD={CI_ACCURACY_THRESHOLD},\\\n",
    "_BEAM_RUNNER={BEAM_RUNNER},\\\n",
    "_TRAINING_RUNNER={TRAINING_RUNNER},\\\n",
    "_DATAFLOW_IMAGE_URI={DATAFLOW_IMAGE_URI},\\\n",
    "_TFX_IMAGE_URI={TFX_IMAGE_URI},\\\n",
    "_PIPELINE_NAME={PIPELINE_NAME},\\\n",
    "_PIPELINES_STORE={PIPELINES_STORE},\\\n",
    "_SUBNETWORK={DATAFLOW_SUBNETWORK},\\\n",
    "_GCS_BUCKET={BUCKET}/cloudbuild,\\\n",
    "_SERVICE_ACCOUNT={DATAFLOW_SERVICE_ACCOUNT},\\\n",
    "_WORKDIR={REPO_NAME}\\\n",
    "\"\"\"\n",
    "!echo $SUBSTITUTIONS"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7856ce61-fe1a-4db4-9c71-fed47129ef7d",
   "metadata": {},
   "outputs": [],
   "source": [
    "!gcloud builds submit build/known_hosts.github.zip --config build/pipeline-deployment.yaml --substitutions {SUBSTITUTIONS} --project=$PROJECT --billing-project=$PROJECT --region $REGION --suppress-logs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "928797f7-20cf-4769-9a35-24ad67900125",
   "metadata": {},
   "outputs": [],
   "source": [
    "!echo gcloud beta builds triggers create cloud-source-repositories --repo={REPO_NAME} --branch-pattern=^{BRANCH}$ --description={DESCR} --build-config=mlops-creditcard/build/pipeline-deployment.yaml --substitutions={SUBSTITUTIONS} --billing-project={PROJECT}  --service-account={TRIGGER_SA}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "64f56b2a-e55f-4b54-b90a-6335fce69314",
   "metadata": {},
   "outputs": [],
   "source": [
    "PUBSUB_TOPIC = f'trigger-{PIPELINE_NAME}'\n",
    "CLOUD_FUNCTION_NAME = f'trigger-{PIPELINE_NAME}-fn'\n",
    "GCS_PIPELINE_FILE_LOCATION = os.path.join(PIPELINES_STORE, f'{PIPELINE_NAME}.json')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ee8e210a-cca3-41e5-9083-ff3a395b9c1a",
   "metadata": {},
   "source": [
    "#### Create Pub/Sub Topic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a03ef492-ee1c-451d-a1e2-c5c511e00326",
   "metadata": {},
   "outputs": [],
   "source": [
    "!gcloud pubsub topics create {PUBSUB_TOPIC}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "547456c0-c55f-492e-97e6-f529182d3a13",
   "metadata": {},
   "source": [
    "#### Deploy Cloud Function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0d094091-3332-4c4c-9bb6-f6e5839d7ded",
   "metadata": {},
   "outputs": [],
   "source": [
    "ENV_VARS=f\"\"\"\\\n",
    "PROJECT={PROJECT},\\\n",
    "REGION={REGION},\\\n",
    "GCS_PIPELINE_FILE_LOCATION={GCS_PIPELINE_FILE_LOCATION},\\\n",
    "SERVICE_ACCOUNT={SERVICE_ACCOUNT},\\\n",
    "PIPELINE_NAME={PIPELINE_NAME}\n",
    "\"\"\"\n",
    "\n",
    "!echo {ENV_VARS}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3da074b8-106c-42bf-b956-e59b249bf5f2",
   "metadata": {},
   "outputs": [],
   "source": [
    "!rm -rf src/pipeline_triggering/.ipynb_checkpoints"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7a695eef-bf18-4fce-b44c-2ac00ae46675",
   "metadata": {},
   "outputs": [],
   "source": [
    "!gcloud functions deploy {CLOUD_FUNCTION_NAME} --gen2 \\\n",
    "    --region={CF_REGION} \\\n",
    "    --trigger-topic={PUBSUB_TOPIC} \\\n",
    "    --runtime=python38 \\\n",
    "    --source=src/pipeline_triggering\\\n",
    "    --entry-point=trigger_pipeline\\\n",
    "    --stage-bucket={BUCKET}\\\n",
    "    --ingress-settings=internal-only\\\n",
    "    --service-account={SERVICE_ACCOUNT}\\\n",
    "    --update-env-vars={ENV_VARS} "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "50ad5728-1b82-4979-bf16-fc518f8a9ba9",
   "metadata": {},
   "source": [
    "#### Test triggering the pipeline with a Pub/Sub message"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ead71b7d-fc63-4a80-8f52-e40f3612d67a",
   "metadata": {},
   "outputs": [],
   "source": [
    "from google.cloud import pubsub\n",
    "import json\n",
    "\n",
    "publish_client = pubsub.PublisherClient()\n",
    "topic = f'projects/{PROJECT}/topics/{PUBSUB_TOPIC}'\n",
    "data = {\n",
    "    'num_epochs': 7,\n",
    "    'learning_rate': 0.0015,\n",
    "    'batch_size': 512,\n",
    "    'steps_per_epoch': int(config.TRAIN_LIMIT) // 512,\n",
    "    'hidden_units': '256,126'\n",
    "}\n",
    "message = json.dumps(data)\n",
    "\n",
    "_ = publish_client.publish(topic, message.encode())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "65a666dc-d06c-4442-875e-86d382e3269f",
   "metadata": {},
   "source": [
    "Check the console to see that it's running.\n",
    "\n",
    "We now have:\n",
    "\n",
    "* A pipeline that can be run to test and deploy new training pipelines\n",
    "* A triggering mechanism to programmatically trigger new training runs\n",
    "* A training run finish with a new model in the Vertex AI Model Registry\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6d67b248-7ccb-4159-9ef2-2238455737f7",
   "metadata": {},
   "source": [
    "## Deploy the model\n",
    "\n",
    "Preparation:\n",
    "\n",
    "* Create a Vertex AI Endpoint\n",
    "* Create a Vertex AI Feature Store and upload some feature data\n",
    "* Deploy a Cloud Function that receives prediction requests, pull features from the Feature Store, call the Endpoint and returns a prediction\n",
    "\n",
    "After this, we will deploy a model deployment pipeline, that will:\n",
    "\n",
    "* Test the model locally\n",
    "* Create the Endpoint if necessary\n",
    "* Deploy the model to the Endpoint\n",
    "* Test the model on the Endpoint\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "99ff68dd-b3c4-4633-88fb-5997636d15ec",
   "metadata": {},
   "source": [
    "### Preparation\n",
    "\n",
    "#### Vertex AI Endpoint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f1793acf-21df-495f-b37b-99a54ec38cdd",
   "metadata": {},
   "outputs": [],
   "source": [
    "ENDPOINT_DISPLAY_NAME"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1088589e-d24a-47a8-9fe4-7a4135a8a5b4",
   "metadata": {},
   "outputs": [],
   "source": [
    "from build.utils import create_endpoint\n",
    "\n",
    "endpoint = create_endpoint(PROJECT, REGION, ENDPOINT_DISPLAY_NAME)\n",
    "endpoint"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "603eb006-ae75-48ca-96d5-5cc22a786bc6",
   "metadata": {},
   "source": [
    "### Model Deployment Pipeline\n",
    "\n",
    "#### Run the model artifact testing\n",
    "\n",
    "Artifact testing requires that the model is deployed to the Vertex AI Model Registry."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a394b2a9-aecd-4ec4-b0b7-f7ab8d2a5c67",
   "metadata": {},
   "outputs": [],
   "source": [
    "os.environ[\"PROJECT\"] = PROJECT\n",
    "os.environ[\"REGION\"] = REGION\n",
    "os.environ[\"SERVICE_ACCOUNT\"] = SERVICE_ACCOUNT\n",
    "os.environ['ENDPOINT_DISPLAY_NAME'] = ENDPOINT_DISPLAY_NAME\n",
    "os.environ[\"MODEL_DISPLAY_NAME\"] = MODEL_DISPLAY_NAME"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "073e0efb-5ae3-4bda-b1cd-d6cc08f9b64c",
   "metadata": {},
   "outputs": [],
   "source": [
    "!python -m pytest src/tests/model_deployment_tests.py::test_model_artifact -s"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0c5a1d30-8dad-4f0a-854b-4de67c548d54",
   "metadata": {},
   "source": [
    "#### Deploy Model to Endpoint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "697f0915-7257-41e0-bf33-2511671a439e",
   "metadata": {},
   "outputs": [],
   "source": [
    "!python build/utils.py \\\n",
    "    --mode=deploy-model\\\n",
    "    --project={PROJECT}\\\n",
    "    --region={REGION}\\\n",
    "    --endpoint-display-name={ENDPOINT_DISPLAY_NAME}\\\n",
    "    --model-display-name={MODEL_DISPLAY_NAME}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a402a585-5d45-4d9d-9497-cef77effb8a3",
   "metadata": {},
   "source": [
    "#### Test model on Endpoint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "062b7c0f-9a18-44cc-800e-a203903df6dd",
   "metadata": {},
   "outputs": [],
   "source": [
    "!python -m pytest src/tests/model_deployment_tests.py::test_model_endpoint"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3a23c981-a4d2-4396-aa1f-3785244a81ca",
   "metadata": {},
   "source": [
    "#### Run the pipeline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "01a0a985-2955-4c33-9cbc-09fa630a1ca2",
   "metadata": {},
   "outputs": [],
   "source": [
    "REPO_URL = main_config['git']['repo_url']\n",
    "BRANCH = main_config['git']['branch']\n",
    "REPO_NAME = REPO_URL.split('/')[-1]\n",
    "CICD_IMAGE_URI = f\"{DOCKER_REPO}/cicd:latest\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6453e104-774d-475d-8159-328ea6ae0249",
   "metadata": {},
   "outputs": [],
   "source": [
    "SUBSTITUTIONS=f\"\"\"\\\n",
    "_REPO_URL='{REPO_URL}',\\\n",
    "_BRANCH={BRANCH},\\\n",
    "_CICD_IMAGE_URI={CICD_IMAGE_URI},\\\n",
    "_PROJECT={PROJECT},\\\n",
    "_REGION={REGION},\\\n",
    "_MODEL_DISPLAY_NAME={MODEL_DISPLAY_NAME},\\\n",
    "_ENDPOINT_DISPLAY_NAME={ENDPOINT_DISPLAY_NAME},\\\n",
    "_GCS_BUCKET={BUCKET}/cloudbuild,\\\n",
    "_SERVICE_ACCOUNT={SERVICE_ACCOUNT},\\\n",
    "_WORKDIR={REPO_NAME}\\\n",
    "\"\"\"\n",
    "\n",
    "SUBSTITUTIONS"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5cd627e0-aec5-4ff0-8bb3-b072aa01fc6d",
   "metadata": {},
   "source": [
    "### Test the build and define a manual trigger"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fa8712b3-b431-4c82-b49f-2df4882c25d4",
   "metadata": {},
   "outputs": [],
   "source": [
    "!echo gcloud builds submit --no-source --config build/model-deployment.yaml --substitutions {SUBSTITUTIONS} --billing-project {PROJECT} --suppress-logs --async"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fb1f6fae-4065-4d4e-bcb1-16585125f00d",
   "metadata": {},
   "outputs": [],
   "source": [
    "DESCR=f'\"Deploy model from branch {BRANCH}\"'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "34927767-4aea-43b8-a0cb-df433dc591cb",
   "metadata": {},
   "outputs": [],
   "source": [
    "!echo gcloud alpha builds triggers create manual --repo={REPO_URL} --repo-type=CLOUD_SOURCE_REPOSITORIES --branch={BRANCH} --description={DESCR} --build-config=mlops-creditcard/build/model-deployment.yaml --substitutions={SUBSTITUTIONS} --billing-project={PROJECT} --service-account={CLOUDBUILD_SA}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1d211c2e-8b5f-4511-845f-a96cefc7b274",
   "metadata": {},
   "source": [
    "## Deploy Prediction Cloud Function\n",
    "\n",
    "The Cloud Function that performs the final prediction has to:\n",
    "    \n",
    "* Receive the features from the prediction request: `V1`, `V2`, ..., `V26`, `Amount`, `userid`\n",
    "* Use the `userid` to retrieve the features `V27` and `V28` from the Feature Store\n",
    "* Query the model on the Vertex AI Endpint\n",
    "* Return the prediction\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "420e70e8-b29a-4fc0-982f-8c6f1121d426",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Making request: POST https://oauth2.googleapis.com/token\n",
      "Starting new HTTPS connection (1): oauth2.googleapis.com:443\n",
      "https://oauth2.googleapis.com:443 \"POST /token HTTP/1.1\" 200 None\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "'predict-creditcards-classifier-v02-train-pipeline-fn'"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "endpoints = vertex_ai.Endpoint.list(\n",
    "    filter=f'display_name={ENDPOINT_DISPLAY_NAME}',\n",
    "    order_by=\"update_time\"\n",
    ")\n",
    "\n",
    "if len(endpoints) == 0:\n",
    "    print(f'No endpoints found with name {ENDPOINT_DISPLAY_NAME}')\n",
    "endpoint = endpoints[-1]\n",
    "\n",
    "os.environ['ENDPOINT_NAME'] = endpoint.name\n",
    "\n",
    "entity = 'user'\n",
    "os.environ['ENTITY'] = entity\n",
    "os.environ['FEATURESTORE_ID'] = FEATURESTORE_ID\n",
    "\n",
    "PREDICT_CLOUD_FUNCTION_NAME = \"predict-\" + PIPELINE_NAME + \"-fn\"\n",
    "PREDICT_CLOUD_FUNCTION_NAME"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "eefa0135-99f3-45f9-b79d-0f04e34c4ee7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'V1': [-0.906611],\n",
       " 'V2': [-0.906611],\n",
       " 'V3': [-0.906611],\n",
       " 'V4': [-0.906611],\n",
       " 'V5': [-0.906611],\n",
       " 'V6': [-0.906611],\n",
       " 'V7': [-0.906611],\n",
       " 'V8': [-0.906611],\n",
       " 'V9': [-0.906611],\n",
       " 'V10': [-0.906611],\n",
       " 'V11': [-0.906611],\n",
       " 'V12': [-0.906611],\n",
       " 'V13': [-0.906611],\n",
       " 'V14': [-0.906611],\n",
       " 'V15': [-0.906611],\n",
       " 'V16': [-0.906611],\n",
       " 'V17': [-0.906611],\n",
       " 'V18': [-0.906611],\n",
       " 'V19': [-0.906611],\n",
       " 'V20': [-0.906611],\n",
       " 'V21': [-0.906611],\n",
       " 'V22': [-0.906611],\n",
       " 'V23': [-0.906611],\n",
       " 'V24': [-0.906611],\n",
       " 'V25': [-0.906611],\n",
       " 'V26': [-0.906611],\n",
       " 'V27': [-0.906611],\n",
       " 'V28': [-0.906611],\n",
       " 'Amount': [15.99]}"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from src.tests.model_deployment_tests import test_instance\n",
    "test_instance"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "72229678-bea2-44a2-b79d-74124882b847",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'scores': [0.5002141, 0.4997859], 'classes': ['legit', 'fraudulent']}\n"
     ]
    }
   ],
   "source": [
    "test_instances = [test_instance]\n",
    "predictions = endpoint.predict(test_instances).predictions\n",
    "\n",
    "for prediction in predictions:\n",
    "    print(prediction)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dc8de820-79c2-4d4e-bd74-4c4320e91149",
   "metadata": {},
   "outputs": [],
   "source": [
    "#In case model is deployed with explanations\n",
    "explanations = endpoint.explain(test_instances).explanations\n",
    "\n",
    "for explanation in explanations:\n",
    "    print(explanation)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "e613276f-a40a-4c4e-b250-c1c488c3331b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'{\"instances\": [{\"V1\": [-0.906611], \"V2\": [-0.906611], \"V3\": [-0.906611], \"V4\": [-0.906611], \"V5\": [-0.906611], \"V6\": [-0.906611], \"V7\": [-0.906611], \"V8\": [-0.906611], \"V9\": [-0.906611], \"V10\": [-0.906611], \"V11\": [-0.906611], \"V12\": [-0.906611], \"V13\": [-0.906611], \"V14\": [-0.906611], \"V15\": [-0.906611], \"V16\": [-0.906611], \"V17\": [-0.906611], \"V18\": [-0.906611], \"V19\": [-0.906611], \"V20\": [-0.906611], \"V21\": [-0.906611], \"V22\": [-0.906611], \"V23\": [-0.906611], \"V24\": [-0.906611], \"V25\": [-0.906611], \"V26\": [-0.906611], \"V27\": [-0.906611], \"V28\": [-0.906611], \"Amount\": [15.99]}]}'"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import json\n",
    "\n",
    "request = {'instances': test_instances}\n",
    "\n",
    "REQ_JSON=json.dumps(request)\n",
    "REQ_JSON"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "24b77ee8-31ca-4c19-8ab7-7a6a1f02e69a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"predictions\": [\n",
      "    {\n",
      "      \"scores\": [\n",
      "        0.5002141,\n",
      "        0.4997859\n",
      "      ],\n",
      "      \"classes\": [\n",
      "        \"legit\",\n",
      "        \"fraudulent\"\n",
      "      ]\n",
      "    }\n",
      "  ],\n",
      "  \"deployedModelId\": \"8097568887035396096\",\n",
      "  \"model\": \"projects/80023113991/locations/europe-west4/models/7404419164699361280\",\n",
      "  \"modelDisplayName\": \"creditcards-classifier-v02\",\n",
      "  \"modelVersionId\": \"1\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "os.environ['PROJECT_ID'] = PROJECT\n",
    "os.environ['REGION'] = REGION\n",
    "os.environ['ENDPOINT_ID'] = endpoint.name\n",
    "os.environ['INPUT_DATA_FILE'] = \"INPUT-JSON\"\n",
    "os.environ['REQ_JSON'] = REQ_JSON\n",
    "!echo ${REQ_JSON} > ${INPUT_DATA_FILE}\n",
    "!curl -X POST -H \"Authorization: Bearer $(gcloud auth print-access-token)\"  -H \"Content-Type: application/json\"  https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/europe-west4/endpoints/${ENDPOINT_ID}:predict  -d \"@${INPUT_DATA_FILE}\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f5070311-5ad6-4afd-9baa-8420aaacd149",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Feature Store (Optional)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "02f89f3a-9456-4f5c-af84-256fd51e3cb6",
   "metadata": {},
   "source": [
    "### Create Feature Store"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7ff735bf-43a6-4fe1-b3c1-055b21860aa9",
   "metadata": {},
   "outputs": [],
   "source": [
    "from google.cloud.aiplatform_v1beta1 import FeaturestoreOnlineServingServiceClient, FeaturestoreServiceClient"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0e76b8fe-3b31-4722-b95a-3f9330e91530",
   "metadata": {},
   "outputs": [],
   "source": [
    "from feature_store import feature_store as fs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ad2103aa-5845-4f51-a982-942b491b3588",
   "metadata": {},
   "outputs": [],
   "source": [
    "fs.create_fs(PROJECT, REGION, FEATURESTORE_ID, \"Feature Store for credit card use case\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c531af74-593a-4439-9c5a-5a45d320d951",
   "metadata": {},
   "outputs": [],
   "source": [
    "from google.api_core import operations_v1\n",
    "from google.cloud.aiplatform_v1beta1 import FeaturestoreOnlineServingServiceClient, FeaturestoreServiceClient, FeatureSelector\n",
    "from google.cloud.aiplatform_v1beta1.types import featurestore_online_service as featurestore_online_service_pb2\n",
    "from google.cloud.aiplatform_v1beta1.types import entity_type as entity_type_pb2\n",
    "from google.cloud.aiplatform_v1beta1.types import feature as feature_pb2\n",
    "from google.cloud.aiplatform_v1beta1.types import featurestore as featurestore_pb2\n",
    "from google.cloud.aiplatform_v1beta1.types import featurestore_service as featurestore_service_pb2\n",
    "from google.cloud.aiplatform_v1beta1.types import io as io_pb2\n",
    "from google.cloud.aiplatform_v1beta1.types import ListFeaturestoresRequest, CreateFeaturestoreRequest, Featurestore, ListEntityTypesRequest\n",
    "\n",
    "from google.protobuf.timestamp_pb2 import Timestamp\n",
    "from google.cloud.aiplatform_v1beta1.types import featurestore_monitoring as featurestore_monitoring_pb2\n",
    "from google.protobuf.duration_pb2 import Duration\n",
    "\n",
    "\n",
    "API_ENDPOINT = f\"{REGION}-aiplatform.googleapis.com\"  \n",
    "admin_client = FeaturestoreServiceClient(client_options={\"api_endpoint\": API_ENDPOINT})\n",
    "parent = f'{admin_client.common_location_path(PROJECT, REGION)}/featurestores/{FEATURESTORE_ID}'\n",
    "request = ListEntityTypesRequest(parent=parent)\n",
    "\n",
    "# Make the request\n",
    "page_result = admin_client.list_entity_types(request=request)\n",
    "\n",
    "# Handle the response\n",
    "[x.name.split('/')[-1] for x in page_result]\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "813e1109-23c9-4df1-9dcd-49579abd4637",
   "metadata": {},
   "outputs": [],
   "source": [
    "admin_client.featurestore_path(PROJECT, REGION, FEATURESTORE_ID)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "857b2803-8136-4aae-9d32-370daf566826",
   "metadata": {},
   "source": [
    "#### Create an entity with features, generate some data and upload it"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4c5e3a9f-1c49-452b-b41d-54b9c766fac4",
   "metadata": {},
   "outputs": [],
   "source": [
    "entity = 'user'\n",
    "entity_descr = 'User ID'\n",
    "features = ['v27', 'v28']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "138a0401-bf30-4433-b6ce-3ce7908bfb63",
   "metadata": {},
   "outputs": [],
   "source": [
    "fs.create_entity(PROJECT, REGION, FEATURESTORE_ID, entity, entity_descr, features)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9b258871-b60a-4841-9d87-974e6b7591d2",
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "\n",
    "filename = f'features_{entity}.csv'\n",
    "\n",
    "with open(filename, 'w') as f:\n",
    "    line = f'{entity},{\",\".join(features)}\\n'\n",
    "    f.write(line)\n",
    "    for i in range(100):\n",
    "        f.write(f'user{i},{random.random()},{random.random()}\\n')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c643c891-971f-4fcc-aa32-608881e3ff9a",
   "metadata": {},
   "outputs": [],
   "source": [
    "!tail -20 {filename}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3245990d-362c-48bd-a006-a39ba04225f2",
   "metadata": {},
   "outputs": [],
   "source": [
    "BUCKET"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a3660a3c-a615-4c09-b9a9-329bc56f7be1",
   "metadata": {},
   "outputs": [],
   "source": [
    "!gsutil cp {filename} gs://{BUCKET}/{filename} "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "534dbb9a-f3d8-41f4-a6a7-d78dcc615ecc",
   "metadata": {},
   "outputs": [],
   "source": [
    "gcs_uris = [f'gs://{BUCKET}/{filename}']\n",
    "\n",
    "fs.ingest_entities_csv(PROJECT, REGION, FEATURESTORE_ID, entity, features, gcs_uris)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "920fd50e-7a4c-475d-9834-eafad695bacb",
   "metadata": {},
   "source": [
    "Test reading some features back"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "481ff3e3-cb67-476c-9e4b-a85fcfe1cf42",
   "metadata": {},
   "outputs": [],
   "source": [
    "features_data = {}\n",
    "for i in range(90,102):\n",
    "    entity_id = f'user{i}'\n",
    "    features_data[entity_id] = fs.read_features(PROJECT, REGION, FEATURESTORE_ID, entity, features, entity_id)\n",
    "\n",
    "features_data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5a116aeb-0366-4876-91dc-5ccd4a81a402",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Deploy Prediction Cloud Function to use with Feature Store\n",
    "\n",
    "The Cloud Function that performs the final prediction has to:\n",
    "    \n",
    "* Receive the features from the prediction request: `V1`, `V2`, ..., `V26`, `Amount`, `userid`\n",
    "* Use the `userid` to retrieve the features `V27` and `V28` from the Feature Store\n",
    "* Query the model on the Vertex AI Endpint\n",
    "* Return the prediction\n",
    "\n",
    "#### Test the enpoint with Feature store"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "101175fb-4cfc-42bf-8b47-87d9a19429ff",
   "metadata": {},
   "outputs": [],
   "source": [
    "from src.tests.model_deployment_tests import test_instance\n",
    "\n",
    "import base64\n",
    "\n",
    "if 'V27' in test_instance:\n",
    "    del test_instance['V27']\n",
    "if 'V28' in test_instance:\n",
    "    del test_instance['V28']\n",
    "test_instance['userid'] = 'user99'\n",
    "\n",
    "test_instance"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d7dbd8f2-46a2-4e58-80f0-c4f7f29b8bb7",
   "metadata": {},
   "outputs": [],
   "source": [
    "from flask import Flask\n",
    "from src.prediction_cf.main import predict\n",
    "\n",
    "app = Flask('test')\n",
    "ctx = app.test_request_context(json=test_instance)\n",
    "request = ctx.request\n",
    "\n",
    "pred_retval = predict(request)\n",
    "pred_retval"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d9a69bc0-dbd0-4278-9095-52a639202ece",
   "metadata": {},
   "outputs": [],
   "source": [
    "GOOGLE_FUNCTION_SOURCE ='src/prediction_cf/main.py'\n",
    "\n",
    "ENV_VARS=f\"\"\"\\\n",
    "PROJECT={PROJECT},\\\n",
    "REGION={REGION},\\\n",
    "ENDPOINT_NAME={endpoint.name},\\\n",
    "ENTITY={entity},\\\n",
    "FEATURESTORE_ID={FEATURESTORE_ID}\n",
    "\"\"\"\n",
    "\n",
    "!echo {ENV_VARS}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3db24101-466a-4a90-b8d9-fbf3539ab33b",
   "metadata": {},
   "outputs": [],
   "source": [
    "!rm -rf src/prediction_cf/.ipynb_checkpoints"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "09d1106d-ee2f-4e49-a4f0-ab89d6c1f77a",
   "metadata": {},
   "outputs": [],
   "source": [
    "!echo gcloud functions deploy {PREDICT_CLOUD_FUNCTION_NAME} --gen2 \\\n",
    "    --set-build-env-vars=GOOGLE_FUNCTION_SOURCE={GOOGLE_FUNCTION_SOURCE} \\\n",
    "    --region={CF_REGION} \\\n",
    "    --runtime=python38 \\\n",
    "    --trigger-http \\\n",
    "    --source=. \\\n",
    "    --entry-point=predict\\\n",
    "    --stage-bucket={BUCKET}\\\n",
    "    --ingress-settings=internal-only\\\n",
    "    --service-account={SERVICE_ACCOUNT}\\\n",
    "    --set-env-vars={ENV_VARS}      "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1c717710-888b-47d8-86c4-02eb6a7e96c5",
   "metadata": {
    "tags": []
   },
   "source": [
    "#### Test the prediction cloud function\n",
    "\n",
    "You can test it using a `curl` command, but this has to be executed from the same VPC that the Cloud Function is deployed in:\n",
    "\n",
    "```\n",
    "curl -m 70 -X POST https://PROJECT.cloudfunctions.net/predict-creditcards-classifier-v01-train-pipeline-fn \\\n",
    "-H \"Authorization:bearer $(gcloud auth print-identity-token)\" \\\n",
    "-H \"Content-Type:application/json\" \\\n",
    "-d '{\"V1\": [-0.906611], \"V2\": [-0.906611], \"V3\": [-0.906611], \"V4\": [-0.906611], \"V5\": [-0.906611], \"V6\": [-0.906611], \"V7\": [-0.906611], \"V8\": [-0.906611], \"V9\": [-0.906611], \"V10\": [-0.906611], \"V11\": [-0.906611], \"V12\": [-0.906611], \"V13\": [-0.906611], \"V14\": [-0.906611], \"V15\": [-0.906611], \"V16\": [-0.906611], \"V17\": [-0.906611], \"V18\": [-0.906611], \"V19\": [-0.906611], \"V20\": [-0.906611], \"V21\": [-0.906611], \"V22\": [-0.906611], \"V23\": [-0.906611], \"V24\": [-0.906611], \"V25\": [-0.906611], \"V26\": [-0.906611], \"Amount\": [15.99], \"userid\": \"user99\"}'\n",
    "```\n",
    "\n",
    "You deploy a VM in the same VPC and run the command from there.\n",
    "\n",
    "Perhaps an easier way to test is test it from the Testing tab on the [web Console](https://console.cloud.google.com/functions/list)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4264177a-3f8f-4a44-87b5-41f20245d7a5",
   "metadata": {},
   "source": [
    "## Conclusion\n",
    "\n",
    "We have defined a Vertex AI Pipeline to train a model using TFX. The model uses credit card transaction data where some of the transactions are labeled fraudulent and it predicts whethers transactions are fraudulent.\n",
    "\n",
    "We have defined and deployed a continuous training pipeline, allowing the model to be retrained by sending a message to a Pub/Sub topic.\n",
    "\n",
    "We have deployed a Feature Store and loaded it with data. We have deployed a Vertex AI Endpoint to host the model.\n",
    "\n",
    "We have defined and deployed a model deployment pipeline, that tests the trained model, deploys it to the Endpoint nad tests the Endpoint.\n",
    "\n",
    "Finally we have deployed a Cloud Function that can serve as the final prediction API, which retrieves some features from the Feature Store, uses these to feed the model and returns the prediction."
   ]
  }
 ],
 "metadata": {
  "environment": {
   "kernel": "python3",
   "name": "tf2-gpu.2-8.m93",
   "type": "gcloud",
   "uri": "gcr.io/deeplearning-platform-release/tf2-gpu.2-8:m93"
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
